Abstract

This paper gives an online algorithm for generating Jakobsson's fractal hash chains [14].
Our new algorithm complements Jakobsson's fractal hash chain algorithm for preimage
traversal, since his algorithm assumes the entire hash chain is precomputed and a particular
list of $\lceil \log n \rceil$ hash elements, or pebbles, is saved. Our online algorithm for hash chain
traversal incrementally generates a hash chain of $n$ hash elements without knowledge of $n$
before it starts. For any $n$, our algorithm stores only the $\lceil \log n \rceil$ pebbles, which are precisely
the inputs for Jakobsson's amortized hash chain preimage traversal algorithm. This compact
representation is useful to generate, traverse, and store a number of large digital hash chains
on a small and constrained device. We also give an application using both Jakobsson's and
our new algorithm applied to digital chains of custody for validating dynamically changing
forensics data.

1 Introduction

This paper proposes a digital forensics system for proactively capturing time-sensitive digital
forensics evidence. Intuitively, we are trying to build fully digital chains of custody for digital
evidence that provide forensics investigators with opportunities to enhance their data's integrity.

Perhaps this system is best applied to monitor suspects that have already been identified. In
some cases, it is important to capture and validate dynamic evidentiary data. This may range
from tracking a very fast virus infection to tracking slower changes made directly by human
touch. Even in monitoring slow changes to a file, capturing and validating an inappropriate
email as it is generated is far more convincing than just getting a snapshot of it retrospectively.

Generally, there is a web-of-trust for validating classical forensic evidence. This web may 
include both witness and expert testimony as well as logical deduction given basic facts about a
situation. The evidence in this web is held together by a chain of custody. A chain of custody 
is careful documentation of the evidence including details of all transfers of its possession for 
examination. A chain of custody is used to authenticate evidentiary exhibits as well as to verify 
these exhibits have not been modified. 



*An extended abstract of this paper is to appear in the Intelligence and Security Informatics 2007 (ISI 2007) 
Conference. 
Our proposed system also initially depends on a web-of-trust. In particular, we must trust
the system administrators or even law enforcement to initiate our system under proper
circumstances. This may be fleshed out in the usual ways using witnesses, experts, and logical
deduction given facts of the case. We maintain a full hash chain of the dynamic evidence for
the following reasons: (1) to document and synchronize the dynamic and time-specific nature of
change in a system, (2) to allow repeated verifications of the data under scrutiny, and (3) in
cases where the hash elements include diverse swaths of compressed data, to give extensive
opportunities for validation using webs-of-trust.

Our system is based on carefully timed hash chains on small constrained tamper-resistant
devices. These timed hash chains are constructed with compressed snapshots of evidence while
using (suspected) one-way hash functions. Generally, the amount of data or the length of time
for data capture is not known in advance for proactively monitoring forensics data. Thus, the
constrained nature of the hardware, the variable granularity of the data, and the unknown time and
number of hash chains make it likely that these devices cannot hold sufficiently many complete hash
chains. Likewise, such devices cannot hold a large set of time-stamps with compressed digital
fingerprints such as hashes.

It is standard forensics practice to use at least three well-known hash functions to verify
data integrity in case one or more of these functions is ever compromised [5, 17]. Such integrity
checks may be used for verifying a disk's data. In our case, they may also be used to vali- 
date dynamic changes in a critical file. Alternatively, hash chains use deferred disclosure to 
establish time-synchronized authenticity. A hash chain can chronologically document evidence 
by generating a timed hash chain forward while including diverse data of interest in the hash 
inputs. Such evidence may be verified by traversing the hash chain backwards while supplying 
the correct inputs at the proper times. This may include data that may be logically deduced 
about data captured in the broader system itself. The backwards hash chain traversal allows repeated
verification by only going back in the hash chain as far as appropriate, leaving the other
concealed hash elements in the chain for subsequent deferred verification. Data may be verified
by both sides in a trial, in addition to the forensic investigator. To preserve data integrity, each 
time data is verified, each constrained device will release another hash element generated prior 
to all already verified hash elements. This deferred disclosure validates the known elements in 
the hash chain and it may tie in to different webs-of-trust. 

Another approach is to have each device hold the first and last hash element of a hash chain. 
Generally, a forensics verifier would start at the beginning of the hash chain and present the 
appropriate data to verify the entire hash chain or a large subset of the hash chain. However, 
such large subsets of the hash chain should include either the first or last element of the hash 
chain for validity and verification. 

Assume the investigator or system administrator that initiated the hash chain-based data
capture is trustworthy. The data is then in the domain of the evidence clerk maintaining the
chain of custody. Subsequently, the hash chain evidence may be called into question: is this
the correct data? The first challenge to this approach is that if the start (or an early element) of
the hash chain is discovered by an attacker, then another fictitious constrained device may be
created that 'verifies' incorrect or modified data. In evidence storage, replacing a gun with
another with a different serial number or bullet grooves is hard to miss. That is, knowing the
first element of a trusted hash chain gives an attacker an opportunity to generate and fake the
rest of the hash chain, except for the last element. The last hash element may be at the center
of the contention. If there is a single end element of the hash chain that disagrees, whose do we
trust? Thus, we have three hash chains that only reveal their elements on demand by deferred
validation: the trusted investigator or system administrator can validate the first or early hash
elements. In fact, ideally there would be enough hash elements that they would not have to
reveal the last hash element. This situation is reminiscent of the motivation for Schneier and
Kelsey's audit logs [21].

Figure 1: Two Critical Time Boundaries $t_0$ and $t_1$ for Evidence $\mathcal{E}$. Time increases from left to
right, and $t_1 - t_0$ may be small, for instance a tiny fraction of a second.

Another possibility is to determine the first time step that verifies critical data. However, 
authenticating the first critical data item requires the prior hash element. If it is useful to 
validate this 'prior hash element,' we must traverse the hash chain back two hash elements from 
the critical data, etc. This is the second challenge to storing the first and last hash elements of 
a hash chain on a constrained device: repeatedly validating critical data each time the data is 
examined through deferred disclosure requires us to traverse the hash chain backwards. 

Computer forensics strives to understand the relationships between suspects and events so 
they can be verified by third parties such as jurors in a court of law. In such situations it is 
critical to establish strong credence of the validity of digital forensics evidence. Digital evidence 
is abstract, ephemeral, time-sensitive, compact, complex, and often encoded. In some cases, 
biological evidence has measurable decay characteristics that allow chronological analysis [20].
Digital evidence does not have such measurable decay characteristics. Also, digital evidence is 
easily copied and copies may be readily manipulated to challenge valid evidence and diminish 
its credibility. 

The digital forensics system proposed here is applied to maintaining timed digital evidence.
Figure 1 illustrates a central application addressed in this paper. This challenge is the interval
time-stamping problem [?, 28]. Consider the left-to-right time line containing the time
interval $[t_0, t_1]$. A central challenge is to demonstrate that the evidence $\mathcal{E}$ was in a particular
user's possession between times $t_0$ and $t_1$. The value $t_1 - t_0$ may be very small. It is well known
that when using public key systems, along with prominent and well distributed data, it is easy
to show $\mathcal{E}$ existed after time $t_1$. However, how does one show $\mathcal{E}$ existed at or before time $t_0$?
One approach is to divide an interval that starts shortly before $t_0$ and ends at $t_1$ into $k + 1$
smaller intervals of equal size. In each smaller interval $[t_{0,i}, t_{1,i}]$ it may be shown that $\mathcal{E}$
existed after time $t_{1,i}$, for each $i : k \geq i \geq 1$. Thus, we can show $\mathcal{E}$ existed before time
$t_{1,k} = t_0$.

Throughout this paper, all logs are base 2. 

1.1 Our Contributions 

This paper proposes constrained devices for securing and validating time-sensitive (and dy- 
namic) forensic data. It assumes tamper-resistant hardware, which may be viewed as a dependency
on a trusted third party. A special online algorithm is given here to prevent a modify-
and-copy attack. This online algorithm allows computer forensics specialists to maintain the 
verifiability of timed digital evidence. 

Take an easy to compute hash function $h$. Assume $h$ is one-way [10]: on average it is
intractable to invert $h$. That is, given $y$ where $y = h(x)$, it is on average intractable to find $x$.
Given a hash function $h$ where $x = h(y)$, $y$ is a preimage of $x$.

Next is an $n + 1$ element hash chain:

$$v_0 \xleftarrow{h(v_1)} v_1 \xleftarrow{h(v_2)} v_2 \xleftarrow{h(v_3)} \cdots \xleftarrow{h(v_{n-1})} v_{n-1} \xleftarrow{h(v_n)} v_n. \qquad (1)$$

Since the hash function is one-way, this hash chain must be initially generated from right 
to left. 

Definition 1 Consider a hash chain as described by Equation 1. The value $v_n$ is the seed of
the hash chain.

Generating the hash values forward in the order $v_n, v_{n-1}, \ldots, v_0$ is hash chain traversal.
Computing hash elements backward in the order $v_0, v_1, \ldots, v_n$ is preimage traversal.

For efficient hash chain traversal see [13]. We always assume the hash function is well known.
Using deferred disclosure of hash elements backwards validates prior knowledge of elements in
the hash chain. In time step 0, given $v_0$, then waiting to time step 1, one can verify that
$v_0 = h(v_1)$, indicating with high probability that $v_0$ and $v_1$ are from the same source.
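
The chain of Equation 1 and the deferred disclosure check can be illustrated with a minimal sketch. SHA-256 is only a stand-in for the hash function $h$, and the names `build_chain` and `verify_disclosure` are hypothetical helpers introduced for this illustration; they are not part of the paper.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in one-way hash. SHA-256 is an assumption, not the paper's choice."""
    return hashlib.sha256(data).digest()


def build_chain(seed: bytes, n: int) -> list[bytes]:
    """Return [v_0, ..., v_n] with v_n = seed and v_{i-1} = h(v_i)."""
    chain = [seed]
    for _ in range(n):
        chain.append(h(chain[-1]))  # generated in the order v_n, v_{n-1}, ..., v_0
    chain.reverse()  # after reversing, index i holds v_i
    return chain


def verify_disclosure(v_known: bytes, v_disclosed: bytes) -> bool:
    """Check a newly disclosed v_{i+1} against the already known v_i."""
    return h(v_disclosed) == v_known


chain = build_chain(b"example seed", 8)
```

Here `verify_disclosure(chain[0], chain[1])` corresponds to the check $v_0 = h(v_1)$ at time step 1. Storing the whole list is only for illustration; the point of the paper is to avoid exactly that on a constrained device.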

At time step $i$, our scheme inputs a chunk of digital evidence $\mathcal{E}_i$. Let $c(\mathcal{E}_i)$ be the compressed
version of $\mathcal{E}_i$. Say $c(\mathcal{E}_i)$ contains a modest number of fixed bits, for instance 160 bits [?, 27].
For example, for the function $c$ we can use a technique such as the Merkle-Damgard construction
of a collision-resistant compression function. Then $v_i \oplus c(\mathcal{E}_i)$ is the input to the hash function
$h$ giving:

$$v_{i-1} \leftarrow h(v_i \oplus c(\mathcal{E}_i))$$

and this process is continued in a carefully timed fashion to give an entire hash chain. The
function '$\oplus$' may be either xor or concatenation.
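
As a hedged illustration of one chain step with evidence folded in, the sketch below takes the combining function to be concatenation, uses SHA-256 for $h$, and uses a SHA-256 digest truncated to 160 bits for $c$. All three choices, and the helper names `step` and `build_evidence_chain`, are assumptions made for this example only.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in for the one-way hash h (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def c(evidence: bytes) -> bytes:
    """Stand-in compression to 160 bits (truncated SHA-256, an assumption)."""
    return hashlib.sha256(evidence).digest()[:20]


def step(v_i: bytes, evidence_i: bytes) -> bytes:
    """One timed step: v_{i-1} = h(v_i || c(E_i)), with concatenation as the combiner."""
    return h(v_i + c(evidence_i))


def build_evidence_chain(seed: bytes, snapshots: list[bytes]) -> list[bytes]:
    """Return the values in generation order, starting from the seed v_n.

    snapshots are consumed in capture order, so snapshots[0] plays the role of E_n.
    """
    values = [seed]
    for snapshot in snapshots:
        values.append(step(values[-1], snapshot))
    return values
```

Changing any snapshot changes every later value in the list, which is what ties the evidence to the timed chain.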

If a hash chain is completely exposed, so that an adversary has access to all elements of

$$C_1 = v_n, v_{n-1}, v_{n-2}, \ldots, v_0,$$

then this hash chain's relative and carefully timed deferred-disclosure-based authenticity may
be easily challenged. For instance, an attacker may take $v_{n-k}$ and falsify the input $\mathcal{E}_{n-k}$ by
changing it to $\mathcal{E}'_{n-k}$, then generate the incorrect hash chain:

$$C_2 = v_n, \ldots, v_{n-k}, v'_{n-k-1}, \ldots$$

where $v'_{n-k-i} \neq v_{n-k-i}$, for all $i : n - k \geq i > 0$.

Now, without proofs of identity or authenticity, it is not clear which chain, $C_1$ or $C_2$, represents
the authenticated data. Of course, proofs of identity or authenticity are only as good as
the systems checking them.
Consider a hash chain with hash seed $v_n$ and last hash element $v_0$. Suppose we only save
$v_n$ and $v_0$ on a small constrained device to validate the data from $n$ different times

$$\mathcal{E}_{n-1}, \ldots, \mathcal{E}_0.$$

If $\mathcal{E}_i$ is the first piece of critical data in hash element $v_i$, then to validate $v_i$ we must know
$v_{i+1}$. But the next time the same hash chain is validated, we take $\mathcal{E}_{i+1}$ and validate it with
$v_{i+2}$. As these validation steps are repeated each time investigators weigh the evidence, we will
be performing a (slow and deliberate) preimage traversal of the hash chain.

The hash chain forward traversal algorithm given in this paper extends Jakobsson's work
by augmenting his hash chain preimage backward traversal algorithm [14]. See also [?]. Our
algorithm uses hash chains to validate forensics evidence by generating the hash chain elements
forward online but only storing $\lceil \log_2 n \rceil$ hash elements for a hash chain of $n$ total hash elements. The
preimages of such a hash chain may then be output using Jakobsson's algorithm and validated by
an evidence clerk. To best ensure the evidence's credence, the clerk should only compute
the backward preimage traversal as far as necessary. Then, additional deferred disclosures may
be performed as needed to re-authenticate the validity of the exposed hash elements.

A difficulty is validating a hash chain by backward preimage traversal with a memory times
computational complexity of less than $O(n)$, where $n$ is the number of hashes required; see for
example [?, 19].

This paper gives an online algorithm for generating a forward hash chain traversal while
always storing pebbles to be used for backwards preimage traversal by Jakobsson's algorithm.
The online algorithm grows a hash chain as requested to any length $n$, but never requires storing
more than $\lceil \log n \rceil$ hash elements or pebbles, where $n$ hash elements have been generated so far.
This is important on memory constrained devices. For any $n$, the $\lceil \log n \rceil$ pebbles can be directly
plugged into Jakobsson's algorithm to start backward preimage traversal for verification.

If $n$ is the number of hash elements stored, other recent methods applicable to emitting hash
elements double the size to $2n$ to store a single additional element, the $(n + 1)$-st element. This
is not acceptable here for several reasons: (1) our approach depends on precisely timed hash
element generation and generating $n$ more hash elements may cause a simple and constrained
device to miss data collection; and (2) the constrained devices may not have the storage to hold
$n$ additional hash elements.

1.2 Previous Work 

Secure audit logs applied to digital forensics were developed by Schneier and Kelsey [23]. This
work assumes three machines: a trusted secure server $\mathcal{T}$, a small machine $\mathcal{U}$ whose secure state is
untrusted and which keeps audit logs, and a sometimes trusted verifier $\mathcal{V}$. The audit log machine
$\mathcal{U}$ is a small constrained machine that is only occasionally securely connected to $\mathcal{T}$. A machine
$\mathcal{U}$ is trusted until it is compromised. If it is compromised, then its audit log entries from before
the compromise cannot be changed or read. This system uses a hash chain to
secure the audit logs on the untrusted machines. But each untrusted machine deletes all but
the last element in the hash chain as the audit log grows. Only $\mathcal{T}$ has the seed of the hash chain
for verification. Our online algorithm along with Jakobsson's can be applied to variations of
Schneier and Kelsey's audit log system. This would save space and potentially allow numerous
hash chains to serve as audit logs. They close their audit log with a 'normalCloseMessage' to
prevent an adversary from extending it illicitly.

To our knowledge, all hash chain preimage traversal work to date assumes either (1) the
hash chain is computed in advance, or (2) the length of the hash chain is known or a bound on
the length is known in advance, or (3) given $2^t$ hash elements, the hash chain may be doubled
in size to $2^{t+1}$ hash elements when the $(2^t + 1)$-st element is needed, for $t$ a positive integer.

The first case does not seem directly applicable to our digital forensics scheme. In the last
two of these cases, there may be a good deal of excess unused memory.

1.2.1 Time-Stamping

Time-stamping is a significant area of research. Haber and Stornetta [11] gave time-stamping
schemes using both hash chains and digital signatures. This is similar to the hash chain scheme
used in this paper. The first method they give is a linking scheme using hash chains to maintain
temporal integrity. All of their schemes depend on a trusted third party. The trusted third party
provides a time-stamping service which applies a hash function and digitally signs the result. Other
related schemes and improvements were given in that paper as well as by Bayer, Haber, and Stornetta [2]
and Haber and Stornetta [12]. Buldas, Laud, Lipmaa and Villemson [4] focus on 'relative
temporal authentication' and give both time-stamping requirements and new algorithms suited
to these requirements. Their methods require a trusted third party to provide time-stamping
services. Ansper, Buldas, Saarepera, and Willemson [1] discuss linkage based protocols (i.e.,
hash chain based protocols) compared to hash-and-sign time-stamping protocols. Building on
work of Willemson [28], Lipmaa [18] gives efficient algorithms to traverse skewed trees. Our
paper uses basic time-stamping by hash chains, though the focus is on constrained devices and
digital forensics.

1.2.2 Hash Chain Traversal 

Our forward hash chain traversal is based on Jakobsson's backward hash chain preimage traversal
work; see also Coppersmith and Jakobsson [7]. Jakobsson gives an asymptotically optimal
algorithm to compute consecutive preimages of hash chains. His algorithm requires $\lceil \log n \rceil$ storage
and $\lceil \log n \rceil$ hash evaluations per hash element output. This assumes $O(n)$ preprocessing
was used to build the hash chain of $n$ elements.

Coppersmith and Jakobsson [7] give an algorithm with amortized time-space product cost
of about $\frac{1}{2} \log^2 n$ per hash chain element. This also assumes $O(n)$ preprocessing was used to
build the hash chain of $n$ elements. They also give a lower bound: computing any
element of the hash chain in the worst case requires a time-space trade-off on the product $TS$, where $T$
is the number of invocations of the hash function and $S$ is the number of stored hash elements.

Sella [25] gives a general solution that applies $k$ hash function evaluations to generate any
element in a hash chain, while storing a total number of hash elements that depends on $k$ and $n$.
His algorithm initially stores hash elements that are at constant intervals along the chain. Kim [15, 16] gives
algorithms that improve Sella's by saving up to half of the space Sella's algorithms use while
keeping the same parametric space and hash evaluation costs. This means Kim's algorithm uses
at most the same space as Jakobsson's algorithm.

Ben-Amram and Petersen [3] give an algorithm for backing up in a singly linked list of
length $n$ in $O(n^{\epsilon})$ time, for any $\epsilon > 0$. This requires $O(n)$ pre-processing of the linked list.
Matias and Porat [19] give list and graph traversal data structures that allow efficient backward
list traversal and very efficient forward traversal. Their 'skeleton' data structures allow
complete back traversals in $O(k n^{1/k})$ amortized time given $k$ elements in storage. Thus, storing
$k = O(\log n)$ elements, their algorithm requires $O(\log i)$ amortized element evaluations for the
$i$-th backward traversal, where $i < n$. They bring out several interesting applications beyond
hash chains. If their data structures are built for lists of $n$ elements, then to accommodate $n + 1$
elements they build a new data structure (list synopsis) for $2n + 1$ total elements, see subsection
4.3 of the full version of [19].

1.3 Structure of this Paper 

In the remainder of the paper we develop the ideas behind our proposed approach. Section 2
gives our model. Section 3 reviews Jakobsson's algorithm in detail. We give proofs of correctness
for this algorithm in the appendix. In order to prove the correctness of our algorithm we found
it necessary to supply detailed proofs of Jakobsson's algorithm (which were not given directly in
Jakobsson's paper). Section 4 introduces the specifics of our online algorithm, discusses how it
interfaces with Jakobsson's amortized algorithm, and gives a proof of correctness. In Section 5
we give some conclusions.

2 Chains of Custody: Physical and Digital 

Next is a formal definition of a chain of custody. 

Definition 2 A chain of custody is a detailed account documenting the handling and access to 
evidence. 

We quote Colquitt [6, Page 484] on the purpose of a chain of custody:

"The purpose, then, of establishing a chain of custody is to satisfy the court that it 
is reasonably probable that the exhibit is authentic and that no one has altered or 
tampered with the proffered physical exhibit." 

A chain of custody, as described in Definition 2, may sometimes be referred to as a classic
chain of custody. Maintaining a chain of custody is a standard practice investigators use to 
inextricably link the evidence that ties a crime to the suspects. 

Digital evidence is often stored using a classical chain of custody. For example, this includes
documenting when a particular individual first picked up a disk drive with critical evidence, the
state of the disk drive, to whom and when they transferred the drive, etc.

Digital evidence is extraordinarily easy to copy. Using standard techniques, each copy of 
digital evidence is easily authenticated. Physical evidence is somewhat different [?, 20]. Consider
a crime committed by using a gun. Manufacturing a new .357 Magnum Ruger Blackhawk
Flattop revolver with an identical 'look,' serial number, and bullet grooves, to one used in a 
crime is extraordinary work. Just finding experts to replicate such physical evidence would 
generally leave a substantial paper trail. Moreover, biological evidence generally also provides 
time frames. Thus, classic chains of custody often focus on the basic identification, authen- 
tication, uniqueness, and time. Digital data lacks such uniqueness and timing characteristics. 
That is, many copies may be made of digital evidence both for legitimate and illicit reasons. 
Furthermore, timestamps alone may not be sufficiently convincing. 
Definition 3 A digital chain of custody is the information preserved about the data and its changes
that shows specific data was in a particular state at a given date and time. 

A DCoC (Digital Chain of Custody) is a small constrained device for holding, authenticating, 
and verifying a digital chain of custody. 

Take a one-way hash function h. Suppose h has deterministic and tightly bounded time 
complexity. There are some hash functions with bounded time complexity that are already 
in use: the RSA SecurID [23] as well as systems like TESLA [21, ?] depend on timed hash
functions. This paper goes one step further advocating extremely precisely timed hash functions. 

This paper is focused on several aspects of evidence maintenance that are related to time
stamping, see for example [8]. One side of the challenge is to show a piece of evidence existed
after a particular point in time. Using the Merkle-Damgard technique, the evidence may be
combined with non-predictable, widely distributed, and time sensitive documents such as a
newspaper or financial market data.*

Similar discussions to the next definition can be found in several places, for example [?, 27].

Definition 4 Data that is non-predictable, widely distributed, verifiably stored, and time sensitive 
is socially bound data. 

A critical issue is that socially bound and highly granular data are not common. In certain 
situations, socially bound data must be highly granular, for example in milliseconds. Thus, 
carefully timed hash chains may be used as a proxy for socially bound data. See also, timestamp 
linkages in 

2.0.1 The Adversary 

The adversary this paper assumes is either (1) an untrustworthy verifier; or (2) a general attack 
against known hash functions on a DCoC. 

Suppose our system computes an $n$ element hash chain ending in $v_0$, and say $n > t$. An
untrustworthy verifier may get an element $v_t$ of a hash chain along with the files of digital
evidence $E = \mathcal{E}_t, \ldots, \mathcal{E}_0$. The adversary may illicitly modify the evidence to $E' = \mathcal{E}'_t, \ldots, \mathcal{E}'_0$
and compute a 'competing' hash chain using $v_t$ and $E'$, thus adding doubt to the validity of the
real evidence. Provided $t < n$, and the one-way hash function $h$ is not broken, we can validate
the hash chain based on $E$ and not $E'$ by producing $v_{t+1}$ and showing that $h(v_{t+1} \oplus c(\mathcal{E}_{t+1})) = v_t$.
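
The validation step in the last sentence can be sketched as follows. This is an illustration under stated assumptions: SHA-256 stands in for $h$, a truncated SHA-256 digest stands in for $c$, the combiner is concatenation, and `link_is_valid` is a hypothetical helper name. The toy values are made up for the example.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in for the one-way hash h (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def c(evidence: bytes) -> bytes:
    """Stand-in 160-bit compression of an evidence snapshot (an assumption)."""
    return hashlib.sha256(evidence).digest()[:20]


def link_is_valid(v_t: bytes, v_t_plus_1: bytes, evidence_t_plus_1: bytes) -> bool:
    """Check h(v_{t+1} || c(E_{t+1})) == v_t for a newly disclosed v_{t+1}."""
    return h(v_t_plus_1 + c(evidence_t_plus_1)) == v_t


# Toy data: the device still conceals v_{t+1}; the verifier already holds v_t.
v_next = b"undisclosed element v_(t+1)"
evidence_next = b"snapshot E_(t+1)"
v_t = h(v_next + c(evidence_next))

genuine = link_is_valid(v_t, v_next, evidence_next)          # disclosed by the real device
tampered = link_is_valid(v_t, v_next, b"modified snapshot")  # altered evidence
```

Only a party that generated the chain can produce a `v_next` that passes this check, since finding one from `v_t` alone would require inverting the hash.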

The issue of a general breach of a hash function is primarily dealt with by following the 
forensics policy of having at least three different known hash functions for the data. This 
procedure is to ensure trust in the hash functions in case one of them is no longer trustworthy [5] . 

2.1 Overview of Our Approach 

This paper assumes a model consisting of at least three constrained devices (DCoCs or cards) 
each having a different hash function. These constrained devices may be interfaced using USB
2.0, for instance. In any case, these devices have a small processor as well as limited memory.
They are tamper resistant. 

*As financial markets become more automated and distributed, it may be feasible for the trading volume to
be granulated down to minute fractions of a second. For example, in 2004 the average NYSE trading day volume
per second was about 1,500,000,000/(60 · 60 · 6.5), or more than 64,000 shares per second. See www.nyse.com.
The volume is known down to the individual share. This highly granular data may be widely distributed.
2.1.1 Data Capture

The DCoCs may each interface through separate USB 2.0 ports simultaneously to gather com- 
mon snapshots of critical system states and memory. Each DCoC may be physically secured 
and transported independently by a different member of law enforcement. We assume these 
devices will be mass produced so that each DCoC has unique authentication software. (They 
should also have unique physical identification.) Different members of law enforcement should 
handle each card making it less likely that all cards may be manipulated by a single member 
of law enforcement. Furthermore, from physical identification, the DCoCs may become part of 
the classical chains of custody. 

Finally, the DCoCs may periodically authenticate one another while capturing data.
Records of these authentications can be included in the hashed values. This authentication
should use zero-knowledge interactive proofs of identity such as the Feige-Fiat-Shamir
protocol [9]. Such a proof of identity is important in this application domain since these proofs of
identity are not transitive. Thus, a fake DCoC cannot impersonate a real one by just mimicking
its proof of identity.

2.1.2 Forensic Data Verification 

The plaintext evidence and diversifying data have been stored as the files $M$ in plain sight on
storage such as an optical disk. Different states or snapshots of the evidence will be periodically
concatenated into timed inputs of the hash chain. Suppose the hash computations run at a fixed 
and known speed on tamper resistant hardware. Thus, computing a hash chain on this special 
hardware can be used to certify the initial data states at specified times, see also Haber and 
Stornetta [11]. 

The verification procedure is done by an evidence clerk and consists of the following steps: 

1. Connect all three DCoCs to a secured verification machine. Initially, each DCoC is au- 
thenticated to all other DCoCs using a zero-knowledge interactive proof of identity. 

2. The DCoCs only output hash elements backwards on an as-needed basis to verify the 
evidence under consideration. 

3. As hash chain elements are output, the verification must continue with each hash element 
step. If a DCoC cannot authenticate another device, then it will alert the evidence clerk 
or even shut down. 

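4. The hash elements from a DCoC are sent out from left to right:

$$h^{n}(v_n \oplus c(\mathcal{E}_n)),\; h^{n-1}(v_{n-1} \oplus c(\mathcal{E}_{n-1})),\; \ldots,\; h^{t}(v_t \oplus c(\mathcal{E}_t)),$$

for $n - 1 \geq t$. Moreover, $h^{n}(v_n \oplus c(\mathcal{E}_n))$ is verified by computing

$$h(h^{n-1}(v_{n-1} \oplus c(\mathcal{E}_{n-1})) \oplus c(\mathcal{E}_n)),$$

and so on.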
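
The verification procedure above can be simulated with a short sketch. Everything here is an assumption made for illustration: three devices are modeled by three `hashlib` algorithms, the combiner is concatenation, $c$ is a truncated SHA-256 digest, and `generate`, `verify_released`, and `clerk_check` are hypothetical names. The zero-knowledge proofs of identity between devices are not modeled.

```python
import hashlib

# Hypothetical assignment of one hash function per DCoC.
DEVICE_HASHES = ("sha256", "sha512", "sha3_256")


def h(name: str, data: bytes) -> bytes:
    """Hash `data` with the algorithm assigned to one device."""
    return hashlib.new(name, data).digest()


def c(snapshot: bytes) -> bytes:
    """Stand-in 160-bit compression of an evidence snapshot (an assumption)."""
    return hashlib.sha256(snapshot).digest()[:20]


def generate(name: str, seed: bytes, snapshots: list[bytes]) -> list[bytes]:
    """Device side: values[0] is the seed, values[i+1] = h(values[i] || c(snapshots[i]))."""
    values = [seed]
    for snap in snapshots:
        values.append(h(name, values[-1] + c(snap)))
    return values


def verify_released(name: str, released: list[bytes], snapshots: list[bytes]) -> bool:
    """Clerk side: each released element must hash, with its snapshot, to the next one."""
    if len(released) != len(snapshots) + 1:
        return False
    return all(
        h(name, released[i] + c(snapshots[i])) == released[i + 1]
        for i in range(len(snapshots))
    )


def clerk_check(seeds: dict[str, bytes], snapshots: list[bytes], depth: int) -> dict[str, bool]:
    """Simulate all devices, then verify only the last `depth` links on each."""
    results = {}
    for name in DEVICE_HASHES:
        values = generate(name, seeds[name], snapshots)
        released = values[-(depth + 1):]  # elements disclosed backwards, as far as needed
        results[name] = verify_released(name, released, snapshots[-depth:])
    return results
```

Keeping `depth` small mirrors step 2: earlier elements stay concealed on the devices for later deferred disclosures.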
2.1.3 Putting It All Together



Given the tamper resistant hardware, just storing data with associated time stamps may quickly 
overflow a constrained device, especially if the time granularity is very fine. Thus, compressed
representation of hash chains can be used to verify time stamps along with the authenticity of 
other constrained devices. 

It is possible to publicly post data from a system under forensic investigation by way of a 
proxy server. This data may be signed by a small device and posted to a public (or private and 
trusted) location. Small devices may have trouble signing large amounts of data due to their 
constraints. Furthermore, for very fine time granularity, fast and consistent network bandwidth 
may not be available. In some contexts, if data is captured before a formal investigation is 
initiated, then it may be important to keep the suspects unaware that data is being captured 
as potential evidence. 

A DCoC may run its hash chain until a forensic examiner or law enforcement officer carefully
checks it into the physical chain of custody while noting the time in detail. This allows a forensic
examiner to determine the time of the evidence by counting the exact number of hashes. A
key motivator of our online algorithm is that in proactive forensics the value of $n$, the number of
time-steps recorded by the DCoCs, is not known in advance. In the case of a digital
chain of custody, it cannot be known precisely how long it will take to get potential evidence
to an evidence clerk, or how long an evidence clerk will take to validate the data, etc. This
precise time as well as the interval between hashes, computed by our algorithm, would have to be
known in advance to fake a hash chain.

The preprocessing phase of Jakobsson's amortized algorithm must know the value of $n$ in
advance of when it is run [13]. Our new algorithm prepares the appropriate $\lceil \log n \rceil$ pebbles
for Jakobsson's algorithm, for any $n$. During hash element generation, for any $n$ the online
algorithm never stores more than $\lceil \log n \rceil$ pebbles. Once the online algorithm has no more
requests to generate new hash elements, then the amortized algorithm [14] may be immediately
started.

3 Jakobsson's Algorithm 

This section reviews Jakobsson's Algorithm [14]. The proofs of correctness are in the Appendix.
Some of these proofs are used in the results of Section 4.

3.1 A Review of Jakobsson's Algorithm 

Given a hash chain with 16 elements, Jakobsson's algorithm is initialized as indicated in Table 1.
In this table, hash element 16 is the seed of the hash chain, and the preimages are exposed
sequentially in the following order: $1, 2, \ldots, 16$. In general, the position of a hash element in a
hash chain ranges over $1, \ldots, n$, where hash element 1 is the first element to be exposed in a
preimage traversal. Thus, hash element $n$ is the seed of the hash chain.



Element Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16
Pebble Placement |   | X |   | X |   |   |   | X |   |    |    |    |    |    |    | X

Table 1: Initial pebble locations for Jakobsson's algorithm 

Suppose $n = |H|$ is the number of hash values in the entire hash chain. In this case, 
Jakobsson's amortized algorithm only stores $\lceil \log n \rceil$ hash pebbles. 

Suppose $n = 2^k$ for some integer $k$. Now, Jakobsson's algorithm [13] is given: 



3.1.1 Jakobsson's Setup. 

Compute the entire hash chain: $v_0, v_1, \ldots, v_n$. 
For pebble p[j], where $j : k = \log_2 n \ge j \ge 1$: 

    p[j].Start_Increment ← 3 × 2^j
    p[j].Dest_Increment ← 2^{j+1}
    p[j].Position ← 2^j
    p[j].Destination ← 2^j
    p[j].Value ← v_{2^j}

Furthermore, 

    Current.Position ← 0
    Current.Value ← v_0


3.1.2 Jakobsson's Main. 

This algorithm in Figure 2 updates the pebbles and values. 

Using Jakobsson's amortized technique [14], a hash chain of size n requires a total of k 
pebbles where $k = \lceil \log n \rceil$. This amortized algorithm performs $O(\log n)$ hash applications 
per hash element output. Initially, in the setup phase, the pebbles store hash elements from 
hash-chain positions $2^1, 2^2, 2^3, \ldots, 2^k$, respectively. 



4 Online Output of Jakobsson's Pebbles for any n 

Jakobsson's amortized algorithm works to conserve both stored pebbles and hash evaluations. 
This allows his algorithm to verify hash values on small sensors. This algorithm assumes 
preprocessing where all hash elements are precomputed, perhaps by a more powerful processor [14]. 

Our aim with the online algorithm is to have a small constrained device that generates 
all requested hash elements. The online algorithm broadens the applicability of Jakobsson's 
amortized algorithm. In particular, the online algorithm generates all hash elements, but only 
stores $\lceil \log n \rceil$ pebbles at any one time, where n is the total number of hash elements generated 
so far. Generating a new hash element requires no hash evaluations beyond the one that 
produces it. 

These $\lceil \log n \rceil$ hash pebbles are positioned so that, at whatever point the online algorithm is 
no longer invoked, Jakobsson's amortized algorithm can be run directly on the 
pebbles. Thus, we believe the amortized and online algorithms are complementary and fit well 
together. 

In Jakobsson's notation each hash element keeps its index throughout the computation, 

$v_0, \ldots, v_n$. 

Given an entire precomputed hash chain 
$v_1, v_2, \ldots, v_n$, compute the pebble positions and auxiliary information. 

    1. if Current.Position = n, then
           Stop
       else
           Current.Position ← 1 + Current.Position
       endif

    2. for j ← 1 to k do
           if p[j].Position ≠ p[j].Destination then
    2.1        p[j].Position ← p[j].Position - 2
    2.2        p[j].Value ← h(h(p[j].Value))
           endif
       endfor

    3. if Current.Position is odd then
           output h(p[1].Value)
       else
           output p[1].Value
    3.1    p[1].Position ← p[1].Position + p[1].Start_Increment
    3.2    p[1].Destination ← p[1].Destination + p[1].Dest_Increment
           if p[1].Destination > n then
               p[1].Position ← +∞
               p[1].Destination ← +∞
           else
               p[1].Value ← FindValue
           endif
           Sort Pebbles by Position
       endif

Figure 2: Jakobsson's Hash-Chain Pebble Position update 



This is because n is fixed, so all hash element numbers remain the same. The pebbles in 
Jakobsson's algorithm are renumbered, according to the initial fixed hash element numbers, 
and sorted by their positions. 

The notation for the online algorithm uses hash element indices whose values change over 
time. 

For instance, in Jakobsson's preimage traversal algorithm, suppose the initial n hash 
elements are: 

$v_1, v_2, \ldots, v_n$. 

Then, after the first element is verified, it is discarded and the following hash elements remain: 

$v_2, \ldots, v_n$. 

Alternatively, since n is not fixed in our online algorithm, suppose the following list of 
elements has already been generated (with only the appropriate pebbles saved), 

$w_1, \ldots, w_n$. 

[Figure: Jakobsson's Algorithm (Link Exposure). Columns represent the same hash link. Grey cells 
represent traversal of cells from the closest pebble.] 

Figure 3: Jakobsson's Algorithm: Each row from top to bottom represents the state of a hash 
chain traversed by Jakobsson's algorithm. Each row represents a single hash element exposure. 
For each row, bold and underlined positions represent the location of pebbles where Position 
= Destination. Bold positions represent a temporary location for a pebble in transit only. 
When a pebble is reached and moved, it is moved to the back of a grey area and then moves toward 
the front of the grey area or sequence of grey areas until it reaches its destination (where it is 
underlined in the next row). 

Then, generate another hash value $w_0 \leftarrow h(w_1)$. This gives n + 1 total hash elements; thus, 
for simplicity, all hash indices are increased by 1. This gives the following renumbering, 

$w_1, \ldots, w_{n+1}$. 

Take $v_i$; then it corresponds to $w_j$, where 

$j = i - (2^{\sigma} - \text{totalHashElements})$,    (2) 

where $\sigma$ is the total number of pebbles and totalHashElements is the number of hash elements 
constructed thus far, with totalHashElements $\le 2^{\sigma}$. Note that when totalHashElements $= 2^{\sigma}$, 
we verify $i = j$ by Observation 1 in the appendix. 

Fact 1 Jakobsson's notation and our notation correspond exactly when $n = 2^k$ for some integer 
$k > 0$. 
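
Equation (2) can be checked with a small helper. This is only a sketch of the index translation as stated above; the function name `to_online_index` and its argument names are hypothetical.

```python
def to_online_index(i: int, sigma: int, total_hash_elements: int) -> int:
    """Map Jakobsson's index i to the online index j using equation (2).

    sigma is the number of pebbles; the chain built so far must not exceed
    2**sigma elements.
    """
    if total_hash_elements > 2 ** sigma:
        raise ValueError("totalHashElements must be at most 2**sigma")
    return i - (2 ** sigma - total_hash_elements)
```

With sigma = 3 and five elements generated so far, Jakobsson's index 8 maps to online index 5 and index 4 maps to online index 1, so the five generated elements line up with the last five positions of an eight-element chain. When totalHashElements equals 2^sigma the mapping is the identity, in line with Fact 1.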

The online algorithm has two distinct, non-interweavable phases: The growth phase, and 
the exposure phase. 

4.1 The Growth Phase 

Next is the growth phase. 

[Figure: Hash Growth Algorithm. Columns represent the same hash link.] 

Figure 4: Online Algorithm: As before, each row represents a hash element exposure. For each 
row, bold and underlined positions represent the location of pebbles. 

4.1.1 The Online Algorithm's Setup 

    seed ← the initial hash value
    totalHashElements ← 1
    grow.value ← seed
    grow.pebble ← 1
    exponent ← 1

4.1.2 The Online Algorithm's Init Pebble 

    InitializePebble(p, j, grow.value)
        p[j].Move_Increment ← 2^{j+1}
        p[j].Start_Increment ← 3 × 2^j
        p[j].Dest_Increment ← 2^{j+1}
        p[j].Position ← 2^j
        p[j].Destination ← 2^j
        p[j].Value ← grow.value

Figure 5: Initialize Pebble 

4.1.3 The Online Algorithm's Main 

Figure 6 updates the pebbles and values. 

    while not done growing hash chain do
        grow.value ← h(grow.value)
        totalHashElements ← totalHashElements + 1

    1.  if totalHashElements - 1 = 2^k for some integer k ≥ 1, then
            exponent ← exponent + 1
            for all pebbles l do
                p[l].Position ← p[l].Position + 2^{exponent-1}
            endfor
            j ← grow.pebble
            Create pebble p[j]
            InitializePebble(p, j, grow.value)
            p[j].distancefromSeed ← totalHashElements
            grow.pebble ← grow.pebble + 1
    2.  else
            for all pebbles l do
                t ← p[l].Move_Increment + p[l].distancefromSeed + 1
    3.          if totalHashElements = t then
    3.1             p[l].Value ← grow.value
    3.2             p[l].distancefromSeed ← totalHashElements
                    p[l].Position ← p[l].Position - p[l].Move_Increment
                endif
            endfor
        endif
    endwhile

    j ← grow.pebble
    4.  Create pebble p[j] where j ← grow.pebble
        InitializePebble(p, j, seed)
        p[j].distancefromSeed ←

    Sort pebbles by Position

Figure 6: Hash Chain Growth Algorithm 

4.2 The Online Algorithm's Exposure Phase 

In this phase, only the setup is different from Jakobsson's original algorithm, because we 
no longer assume that the chain contains exactly $2^k$ hash elements. 

4.2.1 The Online Algorithm's Setup 

    for j ← 1 to ⌈log n⌉ do
        p[j].destination ← p[j].position

Furthermore, we must also initialize the following: 

    current.position ← totalHashElements
    current.value ← grow.value

The online algorithm's Main function is performed exactly as in Jakobsson's algorithm, with 
the exception that current.position cannot simply be set to zero during the setup phase, but 
rather must be determined from the value of the first pebble in the sorted order. 

4.3 Characteristics of the Hash Chain Growth Algorithm 

To prove the validity of the Hash Chain Growth Algorithm we will show: (1) at each step of the 
hash chain growth there are always enough pebbles, and (2) the pebbles are always properly 
placed such that Jakobsson's algorithm can immediately begin to run on the stored data from 
the generated hash chain. 

Recall that, given $n = 2^t$ for some integer $t$, pebbles are placed in hash element positions 
$2^t, 2^{t-1}, 2^{t-2}, \ldots, 2^1$, whose exponents are $t = \log n$, $t - 1 = \log(n/2)$, 
$t - 2 = \log(n/4)$, $\ldots$, $1$. 

For any initial pebble p[j], the value p[j].Destination never decreases in Jakobsson's 
algorithm. Thus, to determine on which move back the initial pebble p[j] will be output, compute 
when 

p[j].Destination > n 

first occurs. 

Assume n is a power of 2, with the definition of $D(j, k)$ as given in the appendix. For 
a fixed $j$, solve for the minimal $k$ so that 

$2^j + k \cdot 2^{j+1} > 2^{\log n}$ 

$2k > 2^{\log n - j} - 1$. 

For instance, for $j = \log(n/2)$, $k = 1$ gives 

$2k = 2 > 2^{\log n - (\log n - 1)} - 1 = 2 - 1 = 1$, 

which holds. 
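
The minimal k in the inequality above can also be found by direct search, which gives a quick check of the algebra. The sketch below assumes the closed form D(j, k) = 2^j + k * 2^(j+1) from the appendix; the function names are hypothetical.

```python
def destination_after(j: int, k: int) -> int:
    """Destination of the initial pebble p[j] after k backward moves."""
    return 2 ** j + k * 2 ** (j + 1)


def moves_until_output(n: int, j: int) -> int:
    """Smallest k with D(j, k) > n, for n a power of two and 1 <= j < log2(n)."""
    k = 1
    while destination_after(j, k) <= n:
        k += 1
    return k
```

For n = 16 the search returns 4, 2 and 1 for j = 1, 2 and 3, which matches 2^(log n - j - 1) and agrees with the worked case j = log(n/2), k = 1.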

Fact 2 Let $n = 2^k$. In Jakobsson's algorithm, pebbles are eliminated in the reverse order of their 
original indices $j = k - 1, \ldots, 1$, where we consider the seed pebble at $j = k$ to never be discarded. 

Proof: This proof follows from Jakobsson's algorithm in Figure 2. By Fact 17 of the appendix, 
it must be that only the pebble in initial position $j = k - 1 = \log(n/2)$ is going to be eliminated 
immediately after Jakobsson's algorithm generates and emits n/2 hash elements. 

Now, suppose the hash elements are renumbered where 1 represents hash element n/2 + 1; 
2 represents hash element n/2 + 2, all the way to n/2, which represents hash element n. Still $n = 2^k$. 

By Fact 19 of the appendix, with this renumbering, the remaining pebbles will be placed 
at pebble positions 

$2^1, 2^2, \ldots, 2^{k-1}$, 

where $n/2 = 2^{k-1}$. The pebble in position p[k - 1] is the seed of the hash chain, so it never 
moves. By Corollary 1, also of the appendix, for each of these remaining pebbles, 

p[j].Position = p[j].Destination. 

However, for any initial pebble p[j], the value p[j].Destination never decreases in the 
algorithm of Figure 2. Thus, we need to determine what initial pebble p[j] is placed in position 
$2^{k-2} = n/4$. This new position is one backward move beyond the n/2-nd emitted hash element. 
Thus, the Dest_Increment of this pebble must be $2^{k-2} = n/4$, so the initial pebble was p[k - 2]. 
Furthermore, by Fact this pebble is the second pebble to be discarded. 

The proof is completed by induction. □ 

The next facts follow from the online algorithm in Figure 6. 

Fact 3 Consider the online hash growth algorithm (see Figure 6). A new pebble is only created 
when totalHashElements is such that 

$2^k + 1 = \text{totalHashElements}$, 

for some $k \ge 1$. 

Proof: This is a direct result of line 1 of the online hash growth algorithm. □ 

Any time $2^k = \text{totalHashElements} - 1$, then between now and when 
$2^{k+1} = \text{totalHashElements} - 1$, all of the pebbles' positions must be shifted by 
$2^{\text{exponent}-1}$. This is done in Figure 6 before a new pebble is created. 

Fact 4 Consider the online hash growth algorithm (see Figure 6). Each time an element is added 
to the chain, totalHashElements is incremented by one. 

Fact 5 Consider the online hash growth algorithm (see Figure 6). Each time a pebble is created it 
is initialized as, 

p[j].distancefromSeed ← totalHashElements. 

This follows directly from section 1 of Figure 6. 

Fact 6 Consider the online hash growth algorithm (see Figure 6). After its initialization, the only 
time p[j].distancefromSeed changes is when pebble j is moved, and p[j].distancefromSeed is 
always updated to maintain the correct distance from pebble p[j] to the hash chain seed. 

Proof: After the initialization of p[j].distancefromSeed in section 1 of Figure 6, the value 
p[j].distancefromSeed only changes when the next assignment is made in line 3.2: 

p[j].distancefromSeed ← totalHashElements 

and this occurs exactly when the condition in line 3 holds true since 

p[j].distancefromSeed + p[j].Move_Increment = totalHashElements, 

completing the proof. □ 

Fact 7 In order to reverse Jakobsson's algorithm, pebbles should be assigned indices in ascending 
order. 

Proof: This holds by Fact 5. □ 

Fact 8 For each pebble, to find the current position for the purposes of the following proof we can 
calculate, 

p[j].Position = totalHashElements - p[j].distancefromSeed. 

Proof: This holds because p[j].Position is a measure of how far from the front of the chain 
a pebble currently is. By Fact 4, totalHashElements is always equal to the total number 
of elements in the chain, and by Facts 5 and 6, p[j].distancefromSeed is always the distance 
the pebble is from the seed, including the seed itself. Thus, we can find how 
far from the front of the hash chain a pebble is (p[j].Position) by totalHashElements - 
p[j].distancefromSeed. □ 

Fact 9 Consider the online hash growth algorithm. The only time a pebble is moved is when 
totalHashElements = t where t ← p[j].Move_Increment + p[j].distancefromSeed. When 
a pebble is moved, it is moved to the front of the chain. 

Fact 10 When the online hash growth algorithm is halted, the final pebble is placed at the root 
with the seed value and the next available pebble index. 

Fact 10 follows immediately from the code starting at line 4 of Figure 6. Jakobsson assumes 
n is a power of two for simplicity [14]. In the case of the online algorithm, values of n other 
than powers of two are also critical. 

Theorem 1 Suppose that the online algorithm in Figure 6 is halted after generating n total hash 
elements. The pebbles associated with the hash chain of size n will be stored at positions equivalent 
to the positions in Jakobsson's algorithm. 

Proof: In the case where $n = 2^k$, the online and amortized algorithms both have pebbles in the 
same positions by Fact 1. 

The proof follows by induction on the ranges from $[2^k, 2^{k+1})$ to $[2^{k+1}, 2^{k+2})$, for $k \ge 2$. 

Basis: Take a hash chain with $k = 2$, thus $n = 2^2 = 4$ elements as stored by the online 
algorithm in Figure 6, 

p[1].Position = 2 
p[1].Move_Increment = 4. 

As the hash chain is being grown after n = 4, the first occurrence of CreatePebble is at 
$2^2 + 1 = 5$. This pebble is placed at the front of the chain by Fact 5 and labeled p[2] by Fact 7. 
Furthermore, 

p[2].Position = 4 
p[2].Move_Increment = 8, 

since grow.pebble = 2. Suppose the hash chain growth is halted at n = 5. Due to Fact 10 we 
know that the final pebble is placed for the seed. The pebbles are sorted by their Positions. 

Growing the hash chain to n = 6 prompts the online hash growth algorithm to check each 
pebble for totalHashElements = t = 6 where, 

t ← p[j].Move_Increment + p[j].distancefromSeed. 

But, totalHashElements ≠ t for all j and n = 6. 
The algorithm finds $t_j$ where 

$t_1$ ← p[1].Move_Increment + p[1].distancefromSeed 
$t_2$ ← p[2].Move_Increment + p[2].distancefromSeed. 

At n = 5 no pebble movement occurs. At n = 6 = the algorithm sets 

p[l].distancefromSeed = p[l].distancefromSeed + p[l].Move_Increment, 

such that 

p[l].distancefromSeed = totalHashElements = 6. 

Suppose the hash chain growth is halted at n = 7. Again, due to Fact 10 we know that the final 
pebble is the seed. Thus, for n = 7 we have pebbles at p[1].Position = 2, p[2].Position = 4 
and p[3].Position = 8 by Fact 8. 

Inductive Hypothesis: The statement of this theorem holds for hash chains from $n = 2^k$ to 
$n = 2^{k+1}$, for some $k \ge 2$. 

Inductive Step: Consider the case where n is increased from $n = 2^{k+1}$ to $n = 2^{k+2}$ where $k \ge 2$. 
By the inductive hypothesis we suppose the pebbles are all placed in the correct positions for 
$n < 2^{k+1}$. 

By Jakobsson's algorithm, for a chain of original size n we know that a pebble is only 
discarded when the total number of remaining elements is n/2. Thus, the hash chain growth 
algorithm only adds new pebbles when the total number of elements is (n/2) + 1. 

Recall that a pebble is always placed at the seed starting at line 4 of the online algorithm; this 
assures we always have the proper number of pebbles. 

By Fact 11 we know that in Jakobsson's algorithm only pebble p[1] is moved backwards. Further, it 
is always moved a total of p[1].Dest_Increment after being moved such that it reaches its 
destination. Since 

p[l].Dest_Increment = p[l].Move_Increment 

and because pebbles are always moved to the front by the online algorithm by Fact 9, we 
know that pebbles are always moved correctly. 

For each n, there are always enough pebbles to support the total number of elements emitted 
in the online algorithm. As just shown, these pebbles are always placed correctly, completing 
the proof. □ 



5 Further Directions 

It would be interesting to extend the algorithms of Sella [25] or Matias and Porat [19] to work 
online. That is, suppose we initially have a data structure capable of holding n points of data; 
it would then be interesting to be able to generate n + 1 elements of data and expand this data 
structure suitably, i.e., without doubling it in size. 

We have not addressed tamper-resistant hardware. Likewise, for very small time 
granularity, we have not addressed clock drift and related timing issues. 

Appendix: A Proof of Correctness for Jakobsson's Algorithm 

This appendix gives a correctness proof for Jakobsson's amortized algorithm. This is done to 
derive the correctness of the new online algorithm given in this paper. Jakobsson's paper [13] 
does not give a complete proof of correctness for his algorithm. However, our online result 
seems to require certain factual statements from the correctness of Jakobsson's algorithm. 

Ultimately, we show that when Jakobsson's algorithm runs emitting $n/2 = 2^k$ hash elements, 
the remaining pebbles contain hash elements of hash-chain positions $2^1, 2^2, 2^3, \ldots, 2^k$, which will 
prove the correctness of Jakobsson's algorithm. 

Jakobsson's work immediately gives the next observation. 

Observation 1 Suppose a hash chain has n hash elements and $n = 2^k$, for some integer $k \ge 1$, and 
Jakobsson's algorithm is about to start. The k pebbles initially contain hash elements in hash-chain 
positions 

$2^1, 2^2, 2^3, \ldots, 2^k$. 

Say Jakobsson's algorithm is about to start on a hash chain of size 2n. Then k + 1 pebbles initially 
contain hash elements in hash-chain positions 

$2^1, 2^2, 2^3, \ldots, 2^k, 2^{k+1}$. 

A proof of Observation 1 follows directly from Jakobsson's setup phase where the pebbles 
are initialized. 

Definition 5 A pebble p[j] is moved backward if p[j].Position is increased in line 3.1 of the 
algorithm in Figure 2. A pebble is moved forward if p[j].Position is decreased by 2 in line 2.1 
of this algorithm. 

Fact 11 Only pebble p[1] moves backwards in Jakobsson's algorithm. Furthermore, it only does so 
when Current.position is even. 

Proof: This holds since the major else in section 3 of Figure 2 is the only place Jakobsson's 
algorithm increases any pebble position. □ 

Fact 12 A pebble p[j] moves two hash elements forward in lines 2.1 and 2.2 of Jakobsson's 
algorithm only after p[j] was moved previously from the front of the chain by Fact 11. 

Proof: In code section 2, making p[j].Position ≠ p[j].Destination may only be done by 
code in section 3 of Figure 2. □ 

Recall that Jakobsson's algorithm initially sets p[j].Position = p[j].Destination, for all 
pebbles $j : \sigma \ge j \ge 1$, where $\sigma$ is the number of pebbles. 

Fact 13 For any pebble p[j], before or after any invocation of Jakobsson's algorithm in Figure 2 
the bound holds: 

p[j].Position ≥ p[j].Destination, 

for all $j : \sigma \ge j \ge 1$ where p[j].Destination ≠ +∞. 

Proof: For each pebble $j \ge 1$, the values Start_Increment $= 3 \times 2^j$ and Dest_Increment 
$= 2^{j+1}$ are both even and never change. Initially in the setup phase 

p[j].Destination $= 2^j$ 
p[j].Position $= 2^j$. 

In section 3, the bound p[j].Position ≥ p[j].Destination is always enforced when 
Current.position is even, due to the values of Start_Increment and Dest_Increment and also 
the assignments on lines 3.1 and 3.2, 

p[1].Position ← p[1].Position + p[1].Start_Increment 
p[1].Destination ← p[1].Destination + p[1].Dest_Increment. 

However, in section 2 of Figure 2, if p[j].Destination ≠ p[j].Position, then p[j].Position 
is decremented as follows: 

p[j].Position ← p[j].Position - 2. 

By Fact 12, during a run of Jakobsson's algorithm, both 

p[j].Position and p[j].Destination 

remain even values since lines 3.1 and 3.2 add even values to even values. 
Line 3.2 ensures 

p[1].Position ≥ p[1].Destination 

and line 2.1 decrements p[j].Position by 2, spanning all even numbers going down. Thus 
p[j].Position must eventually equal p[j].Destination, completing the proof. □ 

Definition 6 The expression $p[j \xrightarrow{k} i]$ indicates the pebble initially set up in position j is currently 
in position i after k backward moves. The expression $p[j \xrightarrow{k} i]$ indicates this pebble was previously 
in position j, but j is not necessarily the initial position of this pebble in setup. 

The expression $p[j \to 1]$ indicates a pebble that was in position j becomes p[1] without 
moving backwards. That is, the pebble labeled p[j] becomes p[1] after emission of j - 1 hash 
elements. 

Note, $p[j \xrightarrow{k} 1]$ indicates that the pebble initially labeled p[j] has become pebble p[1] exactly k 
times. Also, $p[1 \xrightarrow{1} j]$ indicates that the unique pebble initially labeled p[1] is moved backwards 
exactly once and eventually relabeled pebble p[j]. Where $p[1 \to j]$ is a pebble that was earlier 
in position 1 and moved back once, but perhaps it was in another position initially at setup. 

Definition 7 Let $D(j, k)$ be the value of $p[j \xrightarrow{k} i]$.Destination immediately after k backward 
moves of the pebble initially set up as pebble p[j] and became p[i], by the algorithm in Figure 2. 

Thus, by Fact 11, each backward move is initiated when the initial p[j] has become p[1]. 
Next is a bound on $D(j, k)$, for the initial pebble p[j]: 

$D(j, k) = \begin{cases} 2^j + 2^{j+1} & \text{if } k = 1 \\ D(j, k-1) + 2^{j+1} & \text{otherwise.} \end{cases}$ 

Definition 8 Let $P(j, k)$ be an upper bound on the value of $p[j \xrightarrow{k} i]$.Position immediately after 
k backward moves of the pebble initially set up as pebble p[j] and became p[i], by the algorithm in 
Figure 2. 

Next is a bound on $P(j, k)$, for the initial pebble p[j]: 

$P(j, k) \le \begin{cases} 2^j + 3 \cdot 2^j & \text{if } k = 1 \\ P(j, k-1) + 3 \cdot 2^j & \text{otherwise.} \end{cases}$ 



Fact 14 Immediately after moving pebble p[j] backwards to p[i], the next upper bound holds: 

$p[j \xrightarrow{k} i].\text{Position} - p[j \xrightarrow{k} i].\text{Destination} \le 2^j + k(3 \cdot 2^j) - 2^j - k(2 \cdot 2^j) \le k \cdot 2^j$. 

Equality holds immediately after $p[j \xrightarrow{k} i]$ is placed in position i. 

This fact holds by Fact 11. 
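
The recurrences for D(j, k) and the bound P(j, k) can be unrolled mechanically. The sketch below treats the bound on P as an equality, which is an assumption made only for this check, and confirms the closed forms D(j, k) = 2^j + k * 2^(j+1) and P(j, k) - D(j, k) = k * 2^j. The names `D` and `P_bound` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def D(j: int, k: int) -> int:
    """Destination of the initial pebble p[j] after k backward moves."""
    if k == 1:
        return 2 ** j + 2 ** (j + 1)
    return D(j, k - 1) + 2 ** (j + 1)


@lru_cache(maxsize=None)
def P_bound(j: int, k: int) -> int:
    """Upper bound on the position of the initial pebble p[j] after k moves."""
    if k == 1:
        return 2 ** j + 3 * 2 ** j
    return P_bound(j, k - 1) + 3 * 2 ** j
```

For j = 1 and k = 1 this gives D = 6 and P_bound = 8, the values used later in the basis of Fact 19.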
Fact 15 Consider the algorithm in Figure 2. Then for every pebble $p[j \xrightarrow{k} 1]$, it must be that 

$p[j \xrightarrow{k} 1].\text{Position} = p[j \xrightarrow{k} 1].\text{Destination}$, 

where neither $p[j \xrightarrow{k} 1].\text{Position} = +\infty$ nor $p[j \xrightarrow{k} 1].\text{Destination} = +\infty$. 

Proof: Consider the next function that describes how p[j].Destination works. Note that 
line 3.2 of Figure 2 is the only place the Destination field is changed. 

Basis: It must be that 

$p[j \xrightarrow{1} 1].\text{Position} = p[j \xrightarrow{1} 1].\text{Destination}$, 

since by the upper bound in Fact 14, 

$p[j \xrightarrow{1} 1].\text{Position} - p[j \xrightarrow{1} 1].\text{Destination} \le 2^j$. 

However, $D(j, 1) = 2^j + 2^{j+1}$ and by line 2.1 of the algorithm, we must decrement 
$p[j \xrightarrow{1} 1]$.Position by 2 each of $D(j, 1)$ times. This is because the pebbles are sorted by Position. So 
immediately after the initial pebble p[j] is moved back, its destination is $D(j, 1)$. Thus, 
after $D(j, 1)$ more hash elements are emitted, this pebble will be labeled p[1]. 

Thus, since $2D(j, 1) > 2^j$ by Fact 11, 

$p[j \xrightarrow{1} 1].\text{Position} = p[j \xrightarrow{1} 1].\text{Destination}$, 

completing the basis. 

Inductive Hypothesis: Suppose in $k \ge 1$ backward moves of the initial pebble p[j] by the 
algorithm in Figure 2, we have 

$p[j \xrightarrow{k} 1].\text{Position} = p[j \xrightarrow{k} 1].\text{Destination}$. 

Inductive Step: For some $k \ge 1$, consider k + 1 backward moves of the initial pebble p[j'] by the 
algorithm in Figure 2; this step will show: 

$p[j' \xrightarrow{k+1} 1].\text{Position} = p[j' \xrightarrow{k+1} 1].\text{Destination}$. 

Start by considering k + 1 backward moves of the initial pebble p[j']. For the sake of a 
contradiction suppose, 

$p[j' \xrightarrow{k+1} 1].\text{Position} \ne p[j' \xrightarrow{k+1} 1].\text{Destination}$, 

and in any case, by Fact 14, the next bound holds: 

$p[j' \xrightarrow{k+1} j].\text{Position} - p[j' \xrightarrow{k+1} j].\text{Destination} \le 2^{j'}$. 

However, $D(j', 1) = 2^{j'+1} + 2^{j'}$, which means after $2^{j'+1} + 2^{j'}$ hash elements are emitted to 
take the initial pebble p[j'] to p[1] the first time, then 

$p[j' \xrightarrow{k+1} j]$.Position 

must be decremented by 2 in each run of section 2 of Figure 2, decreasing $p[j' \xrightarrow{k+1} 1]$.Position 
by at least $2D(j', 1) > 2^{j'}$. 

Now, by the Inductive Hypothesis it must be that, 

$p[j \xrightarrow{k} 1].\text{Position} = p[j \xrightarrow{k} 1].\text{Destination}$, 

giving, 

$p[j' \xrightarrow{k+1} 1].\text{Position} = p[j' \xrightarrow{k+1} 1].\text{Destination}$, 

indicating our supposition was incorrect, which completes the proof. □ 



Fact 16 Suppose a hash chain starts with $n = 2^{k+1}$ total hash elements, for some integer $k \ge 1$, 
and Jakobsson's algorithm runs emitting $n/2 = 2^k$ hash elements. Then the pebble initially in 
position $j < k$ is moved backwards a total of 

$2^{k-j-1}$ 

times. 

Proof: By Fact 11, a pebble only moves back when it is in pebble position 1. By Fact 13, 

p[j].Position ≥ p[j].Destination, 

so the only concern is how p[j].Destination is initialized and how p[j].Destination changes. 
After n/2 hash elements are emitted, it takes exactly 

$\frac{n/2}{p[j].\text{Dest\_Increment}} = \frac{2^k}{2^{j+1}} = 2^{k-j-1}$ 

increments of p[j].Destination by the time the pebble initially at position j is disposed of. 

A pebble is disposed of when p[j].Destination ← +∞. In terms of Fact 16, the pebble in 
position k + 1 is never moved back and the pebble in position k is disposed of when it is moved 
back the first time. 

Fact 17 Given an $n = 2^k$ element hash chain, take the algorithm in Figure 2. The initial 
pebble in position j, for $j = k - 1 = \log(n/2)$, is the only pebble that is disposed of by the time the 
first n/2 hash elements are emitted. 

Proof: For each pebble, the values Start_Increment and Dest_Increment never change. 
Initially in setup p[j].Destination ← $2^j$, for all $j : \log n \ge j \ge 1$. 

The value p[j].Destination is only increased in line 3.2 of Figure 2. Considering line 3.2 
gives the bound 

p[j].Destination $= 2^j +$ p[j].Dest_Increment 
p[j].Destination $= 2^j + 2^{j+1}$ 
p[j].Destination $= 3 \cdot 2^j$. 

Pebble p[1] is the only one that is moved back by Fact 11. Further, 

$3 \cdot 2^j > n$, 

holds for $j \ge \log(n/2)$. But the only mobile pebble that satisfies this constraint is initially in 
position $j = \log(n/2) = k - 1$. There is a pebble $j = \log n$ at the seed of the hash chain, but 
this never moves. 

Now, take any initial pebble that started in position $j \le \log(n/4)$; then by Fact 16 such a 
pebble makes $2^{k-j-1}$ backward moves by the time n/2 hash elements are emitted. Thus, by the 
time the n/2-nd hash element is emitted, the initial pebble p[j] is moved from the starting position 
$2^j$ to position 

$2^j + (2^{j+1} \cdot 2^{k-j-1}) = 2^j + 2^k$ 

and the largest j and k may be is $\log(n/4)$ and $\log(n/2)$, respectively, giving 

$2^j + 2^k \le 2^{\log(n/4)} + 2^{\log(n/2)} \le \frac{3n}{4} < n$, 

completing the proof. □ 



Fact 18 Suppose a hash chain starts with $n = 2^{k+1}$ total hash elements, for some integer 
$k \ge 1$. While Jakobsson's algorithm runs emitting the first $n/2 = 2^k$ hash elements, let i be the first 
placement of $p[j \to i]$ between hash elements $2^k$ and $2^{k+1}$. Then, Jakobsson's algorithm will emit 
a total of 

$\frac{n}{2} - (2^{k-j-1} - 1) \cdot 2^{j+1} - 2^j$ 

hash elements before emitting hash element $2^k$ but immediately after $p[j \to i]$ is placed at position i. 

Proof: After Jakobsson's algorithm emits hash element $2^k$, a total of n/2 hash elements 
have already been emitted. 

By Fact 16, the pebble initially at position $j < k$ is moved backwards $2^{k-j-1}$ times. That 
is, it appears in pebble position 1 a total of $2^{k-j-1}$ times. 

The last time this pebble moves backward to position i is the first time it moves to a position 
behind hash position $2^k$. This gives the factor of $(2^{k-j-1} - 1)$. 

Finally, for initial pebble j, since Dest_Increment is always $2^{j+1}$, each time we move 
backwards in Jakobsson's algorithm, we move p[j].Destination back by $2^{j+1}$ elements. That is, we 
move back a total of 

$(2^{k-j-1} - 1) \cdot 2^{j+1}$ 

hash elements by the time the pebble initially labeled j passes hash position $2^k$. 

Furthermore, the initially labeled jth pebble starts at position $2^j$, so we adjust for the 
number of hash elements emitted it takes to traverse $p[j \to 1]$. □ 

Corollary 1 We know that when n/2 hash elements are emitted, all remaining pebbles will 
have 

p[j].Destination = p[j].Position. 

Proof: We know by Fact 18 that there are 

$\frac{n}{2} - (2^{k-j-1} - 1) \cdot 2^{j+1} - 2^j$ 

elements that remain to be output before a total of n/2 elements are emitted after each pebble 
at original pebble index j is moved for the last time before n/2 elements are emitted. 

By Jakobsson's algorithm we know that each time a pebble at original position j is moved, 
it will take 

$(3 \cdot 2^j - 2^{j+1})/2$ 

hash exposures for p[j].Position = p[j].Destination. 

Thus, since 

$\frac{n}{2} - (2^{k-j-1} - 1) \cdot 2^{j+1} - 2^j \ge (3 \cdot 2^j - 2^{j+1})/2$, 

we know that when n/2 hash elements are emitted, all remaining pebbles will have p[j].Position 
= p[j].Destination, completing the proof. □ 
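
The inequality used in this proof can be simplified by hand. The short derivation below assumes $n = 2^{k+1}$, as in Fact 18, and that the count from Fact 18 reads as the expression on the left.

```latex
\begin{align*}
\frac{n}{2} - \left(2^{k-j-1} - 1\right) 2^{j+1} - 2^{j}
  &= 2^{k} - 2^{k} + 2^{j+1} - 2^{j} = 2^{j}, \\
\frac{3 \cdot 2^{j} - 2^{j+1}}{2}
  &= \frac{2^{j}}{2} = 2^{j-1}.
\end{align*}
```

Under these assumptions the pebble has $2^{j}$ exposures available and needs only $2^{j-1}$ of them, so the inequality holds for every $j \ge 1$.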



The next fact can also serve as an alternate proof of correctness for Jakobsson's 
algorithm [14]. It is presented here since it is needed subsequently. 

First, the function $R_k$ reindexes the pebble indices from one power of two down to the next 
lowest power of two. When n/2 hash elements are emitted, the total number of remaining hash 
elements goes from $n = 2^{k+1}$ down to $n = 2^k$: 

$R_k(l_{k+1}) = l_{k+1} - 2^k$, 

where $l_{k+1}$ is the position, greater than $2^k$, of a hash element held by a pebble in the chain of 
$2^{k+1}$ elements, and $R_k(l_{k+1})$ is the adjusted position of that hash element in the chain of $2^k$ 
elements. 

Fact 19 Suppose a hash chain starts with $n = 2^{k+1}$ total hash elements, for some integer $k \ge 1$, 
and Jakobsson's algorithm runs emitting $n/2 = 2^k$ hash elements. Then the remaining pebbles 
contain hash elements of hash-chain positions 

$2^1, 2^2, 2^3, \ldots, 2^k$. 



Proof: This proof follows by induction on the algorithm in Figure 2. Setting k = 2 ensures 
sections 1, 2 and 3 of Jakobsson's algorithm will run (see Figure 2). The base case will be k = 2, 
so $n = 2^{k+1} = 8$; exposing n/2 elements leaves $n = 2^k = 4$ total elements. 

Basis: Take a hash chain with $n = 2^3$ total hash elements. Thus, $n = 2^{k+1}$, so k = 2. Then the 
setup phase of Jakobsson's algorithm assigns pebbles to hold hash elements in positions $2^1$, $2^2$ 
and $2^3$. 

Now, suppose n/2 = 4 hash elements are exposed; thus the algorithm in Figure 2 is called 
4 times. 

By Fact 17, only one pebble, the pebble associated with the $2^2$ position, will be discarded. 
This leaves $(k + 1) - 1 = 2$ pebbles remaining. One of these pebbles is associated with the root 
and will not change. For the remaining pebble, associated with the $2^1$ position, 

$P(j, 1) = 2^j + 3 \cdot 2^j = 2^1 + 3 \cdot 2^1 = 8$ 
$D(j, 1) = 2^j + 2^{j+1} = 2^1 + 2^2 = 6$. 

Now, $n = 2^k = 4$ elements remain. In order to put position l, which is in terms of the base 
$2^{k+1}$ hash chain, in terms of the base $2^k$ hash chain, we have our two remaining pebbles at 
$R_{k=2}(6_{g=3}) = 2$ and $R_{k=2}(8_{g=3}) = 4$, where g = k + 1. Since k = 2, our remaining pebbles are 
at positions $2^1$ and $2^2$, or $2^1, \ldots, 2^k$. 

Inductive Hypothesis: Suppose that for a hash chain of size $2^{k+1}$ where $k \ge 2$, after 
Jakobsson's algorithm runs emitting $n/2 = 2^k$ hash elements, the remaining pebbles contain hash 
elements of hash-chain positions 

$2^1, 2^2, 2^3, \ldots, 2^k$. 

Inductive Step: Consider n/2 hash element exposures for a hash chain with $n = 2^{k+1}$ total 
elements where $k \ge 2$. This step will show that the pebbles will be placed at positions equivalent 
to $2^1, 2^2, \ldots, 2^k$. 

Again, by Fact 17, only one pebble, the pebble associated with the $2^k$ position, will be 
discarded. This leaves k - 1 pebbles remaining. One of these pebbles is associated with the 
root and will not change. 

By Fact 15, for all pebbles $p[j \xrightarrow{k} 1]$, it must be that 

$p[j \xrightarrow{k} 1].\text{Position} = p[j \xrightarrow{k} 1].\text{Destination}$. 

For all remaining pebbles after n/2 hash exposures, 

p[j].Position $= P(j, q)$, 
p[j].Destination $= D(j, q)$. 

By Corollary 1 we know that for all remaining pebbles p[j].Position = p[j].Destination 
when n/2 elements are output, and thus all pebbles are inactive. 

Further, since we know that p[j].Position = p[j].Destination, the remaining pebble 
positions are defined by the destination function $D(j, m)$ as defined earlier, where m is the number 
of times the pebble at original position j is moved back and 

$D(j, m) = 2^j + m \cdot 2^{j+1}$. 

By Fact 16, $m = 2^{k-j-1}$ for each pebble at initial position j over the first n/2 hash element 
exposures. 
Therefore, 

$D(j, 2^{k-j-1}) = 2^j + 2^{k-j-1} \cdot 2^{j+1}$ 
$D(j, 2^{k-j-1}) = 2^j + 2^k$ 

given in terms of base $n = 2^{k+1}$ total elements. To put this result in terms of our new hash 
chain $n = 2^k$ we simply apply the $R_k(l_{k+1})$ function to get 

$R_k(2^j + 2^k) = 2^j + 2^k - 2^k = 2^j$ 

where $j = 1, \ldots, k$. 

Thus, by the inductive hypothesis it must be that when Jakobsson's algorithm runs emitting 
$n/2 = 2^k$ hash elements, the remaining pebbles contain hash elements of hash-chain positions 
$2^1, 2^2, 2^3, \ldots, 2^k$, thus completing the proof. □ 
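
Facts 16, 17 and 19 and Corollary 1 can be spot-checked by simulating only the position bookkeeping of Figure 2, with no hashing. This is a rough check under the reading of the figure given earlier, not a substitute for the proofs; the function name `simulate_half` is hypothetical.

```python
import math


def simulate_half(k: int):
    """Run the position updates of Figure 2 for n = 2^(k+1), emitting n/2 elements.

    Returns (state, moves): state maps each original pebble index j to its
    (position, destination) pair, and moves maps j to its backward-move count.
    """
    n = 2 ** (k + 1)
    # Each pebble is [original index j, position, destination].
    pebbles = [[j, 2 ** j, 2 ** j] for j in range(1, k + 2)]
    moves = {j: 0 for j in range(1, k + 2)}
    for current in range(1, n // 2 + 1):
        for p in pebbles:                       # section 2: move forward by 2
            if p[1] != p[2]:
                p[1] -= 2
        if current % 2 == 0:                    # section 3: move p[1] backward
            first = pebbles[0]
            j = first[0]
            first[1] += 3 * 2 ** j              # line 3.1, Start_Increment
            first[2] += 2 ** (j + 1)            # line 3.2, Dest_Increment
            moves[j] += 1
            if first[2] > n:
                first[1] = first[2] = math.inf  # pebble is discarded
            pebbles.sort(key=lambda p: p[1])
    return {p[0]: (p[1], p[2]) for p in pebbles}, moves
```

For k = 3 (n = 16) the simulation leaves pebbles at positions 10, 12 and 16, which are 2, 4 and 8 after subtracting n/2 = 8, and it reports two backward moves for the pebble that started at position 2 and one for the pebble that started at position 4.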